Rewarding Behaviors
Abstract
Markov decision processes (MDPs) are a very popular tool for decision-theoretic planning (DTP), partly because of the well-developed, expressive theory that includes effective solution techniques. But the Markov assumption, that dynamics and rewards depend on the current state only and not on history, is often inappropriate. This is especially true of rewards: we frequently wish to associate rewards with behaviors that extend over time. Of course, such reward processes can be encoded in an MDP if the state space is rich enough (that is, if states encode enough history). However, it is often difficult to "hand-craft" suitable state spaces that encode an appropriate amount of history. We consider this problem in the case where non-Markovian rewards are encoded by assigning values to formulas of a temporal logic. These formulas characterize the value of temporally extended behaviors. We argue that this allows a natural representation of many commonly encountered non-Markovian rewards. The main result is an algorithm which, given a decision process with non-Markovian rewards expressed in this manner, automatically constructs an equivalent MDP (with Markovian reward structure), allowing optimal policy construction using standard techniques.
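The idea of enriching the state space so that a history-dependent reward becomes Markovian can be illustrated with a small hand-built sketch. This is not the algorithm from the paper, which constructs the expanded MDP automatically from temporal logic formulas. Everything here is a hypothetical toy chosen for illustration: the four-position line world, the names `KEY` and `DOOR`, and the reward "reach `DOOR` after having visited `KEY` at some earlier step". The sketch adds one bit of history to each state and checks by brute force that the reward on the expanded state agrees with the reward on the full history.

```python
from itertools import product

# Hypothetical toy world (an assumption for illustration): positions 0..3 on a line.
BASE_STATES = [0, 1, 2, 3]
KEY, DOOR = 0, 3


def base_step(state, action):
    """Deterministic base dynamics: action is -1 or +1, clipped to the line."""
    return min(max(state + action, 0), 3)


def history_reward(history):
    """Non-Markovian reward: depends on the whole history, not just the last state.

    Pays 1.0 when the current state is DOOR and KEY was visited at an earlier step.
    """
    return 1.0 if history[-1] == DOOR and KEY in history[:-1] else 0.0


def expanded_initial(state):
    """Expanded state = (position, seen_key); nothing has been seen at the start."""
    return (state, False)


def expanded_step(xstate, action):
    """Move in the base world and record whether KEY was visited before the new state."""
    pos, seen = xstate
    return (base_step(pos, action), seen or pos == KEY)


def expanded_reward(xstate):
    """Markovian reward: depends only on the current expanded state."""
    pos, seen = xstate
    return 1.0 if pos == DOOR and seen else 0.0


def rewards_agree(max_len=6):
    """Brute-force check that both reward definitions agree on every short trajectory."""
    for start in BASE_STATES:
        for n in range(max_len + 1):
            for actions in product((-1, 1), repeat=n):
                history = [start]
                x = expanded_initial(start)
                if history_reward(history) != expanded_reward(x):
                    return False
                for a in actions:
                    history.append(base_step(history[-1], a))
                    x = expanded_step(x, a)
                    if history_reward(history) != expanded_reward(x):
                        return False
    return True


if __name__ == "__main__":
    print(rewards_agree())
```

In this toy case a single extra bit suffices, so the expanded MDP has at most twice as many states as the base process. Choosing which history to record is the part that is hard to do by hand in general, and it is the step the abstract says the algorithm automates.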
Publication date: 1996